Abstract - A statistical mechanical framework for analyzing
random linear vector channels is presented in a large system
limit. The framework is based on the assumptions that the left
and right singular value bases of the rectangular channel matrix
$H$ are generated independently from uniform distributions over
Haar measures and that the eigenvalues of $H^{\mathrm{T}} H$
asymptotically follow a certain specific distribution. These
assumptions make it possible to characterize the communication
performance of the channel by means of an integral formula with
respect to $H$, which is analogous to the one introduced by
Marinari et al. in J. Phys. A 27, 7647 (1994) for large random
square (symmetric) matrices. A computationally feasible algorithm
for approximately decoding received signals based on the integral
formula is also provided.

I. Introduction 

In a general scenario for linear vector channels, multiple
messages are transmitted to the receiver, being linearly
transformed to multiple output signals by a random matrix and
degraded by channel noises. This yields a complicated dependence
on message variables, which ensures that the problem of inferring
the transmitted messages from the received output signals is
non-trivial. In general, inference problems of this kind can be
mapped to virtual magnetic systems governed by random
interactions [1]. This similarity has promoted a sequence of
statistical mechanical analyses of linear vector channels in a
large system limit from the beginning of this century
[2], [3], [4], [5], [6], [7], [8].

In the simplest analysis, each entry of the channel matrix
is regarded as an independent and identically distributed (IID)
random variable. However, such a treatment is not necessarily
adequate for describing realistic systems, in which
non-negligible statistical correlations across the matrix entries
are created by spatial/time proximity of messages/antennas or by
matrix design for enhancement of communication performance.
Therefore, the development of methodologies that can deal
with correlations in the channel matrix is of great importance
to research in the area of linear vector channels.

The present article aims to contribute such a methodology for
application to these communication channels. More precisely, we
develop a statistical mechanical framework for analyzing linear
vector channels so that the influence of the correlations across
the matrix entries can be taken into account. The developed
framework is applicable not only to Gaussian channels of Gaussian
inputs [9], but also to general memory-less channels of
continuous/discrete inputs, which are characterized by a
factorizable prior distribution.

This article is organized as follows: In the next section, the
model of linear vector channels that we focus on herein is
defined. In section III, which is the main part of the current
article, an integral formula with respect to large random
rectangular matrices is introduced. A scheme to assess the
performance of the linear vector channel and a computationally
feasible approximate decoding algorithm are developed on the
basis of this formula. The utility of the developed schemes is
examined in section IV by application to an example system.
The final section summarizes the present study's findings.

II. Model definition 

For simplicity, we here assume that all the variables relevant
to the communication are real; extending the following
framework to complex variables is straightforward [10]. Let
us suppose a linear vector channel in which an input message
vector of $K$ components, $x = (x_k)$, is linearly transformed
to an $N$ dimensional sequence, $\Delta = (\Delta_\mu)$, by an
$N \times K$ channel matrix, $H = (H_{\mu k})$, as $\Delta = Hx$.
For generality and simplicity, we assume a general memory-less
channel, which implies that an $N$ dimensional output signal
vector, $y = (y_\mu)$, follows a certain factorizable conditional
distribution as

$$P(y|x;H) = P(y|Hx) = \prod_{\mu=1}^{N} P(y_\mu|\Delta_\mu). \tag{1}$$

In addition, we assume a factorizable prior distribution

$$P(x) = \prod_{k=1}^{K} P(x_k), \tag{2}$$

for $x$, which may be continuous or discrete.

An expression of the singular value decomposition of $H$

$$H = U D V^{\mathrm{T}}, \tag{3}$$

is the basis of our framework. Here, the superscript T denotes
the transpose of the matrix to which it is attached,
$D = {\rm diag}(d_k)$ is an $N \times K$ diagonal matrix composed
of the singular values $d_k$ ($k = 1, 2, \ldots, \min(N, K)$),
where $\min(N, K)$ denotes the lesser value of $N$ and $K$. The
values $d_k$ are linked to the eigenvalues of $H^{\mathrm{T}} H$,
$\lambda_k$, as $\lambda_k = d_k^2$ for
$k = 1, 2, \ldots, \min(N, K)$. $U$ and $V$ are orthogonal
matrices of order $N \times N$ and $K \times K$, respectively. In
order to handle correlations in $H$ analytically, we assume that
$U$ and $V$ are independently generated from uniform
distributions of the Haar measures of $N \times N$ and
$K \times K$ orthogonal matrices, respectively, and that the
empirical eigenvalue distribution of $H^{\mathrm{T}} H$,
$K^{-1}\sum_{k=1}^{K}\delta(\lambda - \lambda_k) = (1 - \min(N, K)/K)\,\delta(\lambda) + K^{-1}\sum_{k=1}^{\min(N,K)}\delta(\lambda - d_k^2)$,
converges to a certain specific distribution $\rho(\lambda)$ in
the limit as $N$ and $K$ tend to infinity while keeping the load
$\beta = K/N \sim O(1)$. Controlling $\rho(\lambda)$ allows
us to express various second-order correlations in $H$.
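The matrix ensemble defined above can be sampled numerically. The following NumPy sketch is only an illustration of the construction in Eq. (3): the function names `haar_orthogonal` and `sample_channel`, the example sizes, and the uniform choice of singular values are assumptions made here, and the Haar matrices are drawn with the standard recipe of a QR decomposition of a Gaussian matrix followed by a sign correction.

```python
import numpy as np

def haar_orthogonal(n, rng):
    """Draw an n x n orthogonal matrix from the Haar measure.

    QR of a Gaussian matrix, with each column of Q multiplied by the sign
    of the matching diagonal entry of R so that the law is exactly Haar.
    """
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

def sample_channel(singular_values, N, K, rng):
    """Build H = U D V^T with prescribed singular values (hypothetical helper)."""
    d = np.asarray(singular_values, dtype=float)
    assert d.shape == (min(N, K),)
    D = np.zeros((N, K))                      # N x K "diagonal" matrix
    idx = np.arange(d.size)
    D[idx, idx] = d
    U = haar_orthogonal(N, rng)
    V = haar_orthogonal(K, rng)
    return U @ D @ V.T

rng = np.random.default_rng(0)
N, K = 40, 60                                 # load beta = K / N = 1.5
d = rng.uniform(0.5, 1.5, size=min(N, K))     # arbitrary example spectrum
H = sample_channel(d, N, K, rng)
eig = np.linalg.eigvalsh(H.T @ H)             # ascending eigenvalues of H^T H
```

With $K > N$, the spectrum of $H^{\mathrm{T}} H$ obtained this way consists of $K - N$ zeros plus the values $d_k^2$, which is the decomposition of the empirical eigenvalue distribution written above.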

III. Analysis 

A. An integral formula for large random rectangular matrices 

With knowledge of $H$, the receiver decodes $y$ in order to
infer $x$, which is performed on the basis of the Bayes formula

$$P(x|y;H) = \frac{\prod_{\mu=1}^{N} P(y_\mu|\Delta_\mu)\prod_{k=1}^{K} P(x_k)}{P(y;H)}. \tag{4}$$

Here, the probability

$$P(y;H) = \operatorname{Tr}_{x}\prod_{\mu=1}^{N} P(y_\mu|\Delta_\mu)\prod_{k=1}^{K} P(x_k), \tag{5}$$

expresses the marginal probability with respect to $y$, where
$\operatorname{Tr}_{x}$ denotes summation or integration over all
possible states of $x$. Eq. (5) also serves as the partition
function concerning the message vector $x$ in statistical
mechanics.

Let us examine statistical properties of Eq. (5) prior to
analyzing Eq. (4). The expression

$$P(y;H) = \operatorname{Tr}_{x,u}\exp\left(\mathrm{i}u^{\mathrm{T}}Hx\right)\prod_{\mu=1}^{N}\hat{P}(y_\mu|u_\mu)\prod_{k=1}^{K}P(x_k), \tag{6}$$

is useful for this purpose, where $\mathrm{i} = \sqrt{-1}$,
$u = (u_\mu)$ and
$\hat{P}(y_\mu|u_\mu) = (2\pi)^{-1}\int d\Delta_\mu \exp(-\mathrm{i}u_\mu\Delta_\mu)P(y_\mu|\Delta_\mu)$
denotes the Fourier transformation of the likelihood
$P(y_\mu|\Delta_\mu)$. We substitute $H$ in Eq. (6) by Eq. (3)
and take an average with respect to $U$ and $V$. For this
assessment, it is noteworthy that for any fixed set of $u$ and
$x$, $\tilde{u} = U^{\mathrm{T}}u$ and $\tilde{x} = V^{\mathrm{T}}x$
behave as continuous random variables that satisfy the strict
constraints $N^{-1}|\tilde{u}|^2 = N^{-1}|u|^2 = T_u$ and
$K^{-1}|\tilde{x}|^2 = K^{-1}|x|^2 = T_x$. In the limit
$N, K \to \infty$ keeping $\beta = K/N \sim O(1)$, which we
will hereafter assume if necessary, this yields an expression

$$\frac{1}{N}\ln\overline{\exp\left(\mathrm{i}u^{\mathrm{T}}Hx\right)} = F(T_x, T_u), \tag{7}$$

where $\overline{\cdots}$ denotes the average with respect to $U$ and $V$, and

$$F(\xi,\eta) = \mathop{\rm Extr}_{\Lambda_\xi,\Lambda_\eta}\left\{-\frac{\beta}{2}\left\langle\ln(\Lambda_\xi\Lambda_\eta + \lambda)\right\rangle_\rho - \frac{1-\beta}{2}\ln\Lambda_\eta + \frac{\beta\Lambda_\xi\xi}{2} + \frac{\Lambda_\eta\eta}{2}\right\} - \frac{\beta}{2}\ln\xi - \frac{1}{2}\ln\eta - \frac{1+\beta}{2}, \tag{8}$$

where $\langle\cdots\rangle_\rho$ denotes the average with respect
to $\rho(\lambda)$, while $\mathop{\rm Extr}_\theta\{\cdots\}$
represents extremization with respect to $\theta$ [11].
This formula is analogous to the one known for ensembles of
random square (symmetric) matrices [12], [13], [14], which
is closely related to the R-transformation developed in free
probability theory [15], [9], [16]. Several integral formulae
for large random matrices related to Eq. (7), but for different
large system limits, are presented in [17].
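To illustrate how the extremization in Eq. (8) can be handled numerically, the sketch below evaluates $F$ parametrically: rather than solving for the conjugate variables at a given $(\xi, \eta)$, it starts from a pair $(\Lambda_\xi, \Lambda_\eta)$, reads off the $(\xi, \eta)$ for which that pair is stationary, and returns the corresponding value of $F$. It is a sketch under stated assumptions, not part of the original analysis: the spectrum is taken to be discrete, $\rho(\lambda) = \sum_j w_j\delta(\lambda - \lambda_j)$, the function name and arguments are hypothetical, and $F$ is assumed to have the form $-\frac{\beta}{2}\langle\ln(\Lambda_\xi\Lambda_\eta+\lambda)\rangle_\rho - \frac{1-\beta}{2}\ln\Lambda_\eta + \frac{\beta\Lambda_\xi\xi}{2} + \frac{\Lambda_\eta\eta}{2} - \frac{\beta}{2}\ln\xi - \frac{1}{2}\ln\eta - \frac{1+\beta}{2}$.

```python
import numpy as np

def integral_formula_F(lam_xi, lam_eta, eigvals, weights, beta):
    """Return (xi, eta, F) at the stationary point labelled by (lam_xi, lam_eta).

    Assumed stationarity conditions of the bracket in Eq. (8):
      xi  = < lam_eta / (lam_xi * lam_eta + lambda) >_rho
      eta = (1 - beta) / lam_eta + beta * < lam_xi / (lam_xi * lam_eta + lambda) >_rho
    """
    eigvals = np.asarray(eigvals, dtype=float)
    weights = np.asarray(weights, dtype=float)
    denom = lam_xi * lam_eta + eigvals
    xi = np.sum(weights * lam_eta / denom)
    eta = (1.0 - beta) / lam_eta + beta * np.sum(weights * lam_xi / denom)
    F = (-0.5 * beta * np.sum(weights * np.log(denom))
         - 0.5 * (1.0 - beta) * np.log(lam_eta)
         + 0.5 * beta * lam_xi * xi
         + 0.5 * lam_eta * eta
         - 0.5 * beta * np.log(xi)
         - 0.5 * np.log(eta)
         - 0.5 * (1.0 + beta))
    return xi, eta, F

# Example: a single small eigenvalue; F should be close to -(beta/2)*lambda*xi*eta.
xi, eta, F = integral_formula_F(1.0, 1.0, [1e-4], [1.0], beta=0.5)
```

Two simple consistency checks follow from this parametrization: for a spectrum concentrated at $\lambda = 0$ the value of $F$ vanishes, and for small eigenvalues $F$ approaches $-(\beta/2)\langle\lambda\rangle_\rho\,\xi\eta$, which is what a second-order expansion of the left-hand side of Eq. (7) gives.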
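Eq. (7) implies

$$\frac{1}{N}\ln\left[\operatorname{Tr}_{y}\overline{P(y;H)}\right] = \mathop{\rm Extr}_{T_x,T_u}\left\{F(T_x,T_u) + \beta A_x(T_x) + A_u(T_u)\right\}, \tag{9}$$

where
$A_x(T_x) = \mathop{\rm Extr}_{\hat{T}_x}\left\{\hat{T}_xT_x/2 + \ln\left(\operatorname{Tr}_{x}P(x)e^{-\hat{T}_xx^2/2}\right)\right\}$
and
$A_u(T_u) = \mathop{\rm Extr}_{\hat{T}_u}\left\{\hat{T}_uT_u/2 + \ln\left(\operatorname{Tr}_{y,u}\hat{P}(y|u)e^{-\hat{T}_uu^2/2}\right)\right\}$.

The normalization constraint $\operatorname{Tr}_{y}P(y;H) = 1$,
in conjunction with the extremization in Eq. (9), yields
$T_x = \operatorname{Tr}_{x}x^2P(x)$, $\hat{T}_x = 0$, $T_u = 0$
and $\hat{T}_u = \beta\langle\lambda\rangle_\rho T_x$. The
physical implication of these results is that components of
$\Delta = Hx$ behave as IID Gaussian variables of zero mean and
variance $\hat{T}_u$ in the large system limit when $x$ is drawn
from Eq. (2), while $U$ and $V$ are independently generated from
the Haar measures.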
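The value $\beta\langle\lambda\rangle_\rho T_x$ can also be checked directly from Eq. (3). The following short calculation is only a sketch: it assumes that, for Haar-distributed $V$, the squared components of $\tilde{x} = V^{\mathrm{T}}x$ can be replaced by their common average $|x|^2/K$, and it uses the convention $\lambda_k = 0$ for $k > \min(N, K)$.

$$\frac{1}{N}|\Delta|^2 = \frac{1}{N}x^{\mathrm{T}}H^{\mathrm{T}}Hx = \frac{1}{N}\tilde{x}^{\mathrm{T}}D^{\mathrm{T}}D\tilde{x} = \frac{1}{N}\sum_{k=1}^{K}\lambda_k\tilde{x}_k^2 \simeq \frac{K}{N}\left(\frac{1}{K}\sum_{k=1}^{K}\lambda_k\right)\frac{|x|^2}{K} = \beta\langle\lambda\rangle_\rho T_x .$$

The average squared component of $\Delta$ therefore agrees with the variance obtained from the extremization.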

B. Performance assessment 

Now, we are ready to analyze the typical communication
performance of the current channel model. This is performed
by assessing the typical mutual information (per output signal)
between $x$ and $y$, $I(X, Y)$, based on Eqs. (1) and (5) as

$$I(X,Y) = \frac{1}{N}\operatorname{Tr}_{y,x}P(y|x;H)P(x)\ln\left(\frac{P(y|x;H)}{P(y;H)}\right) = \mathcal{F} + \operatorname{Tr}_{y}\int Dz\,P\left(y\Big|\sqrt{\hat{T}_u}z\right)\ln P\left(y\Big|\sqrt{\hat{T}_u}z\right), \tag{10}$$

where

$$\mathcal{F} = -\frac{1}{N}\operatorname{Tr}_{y}P(y;H)\ln P(y;H), \tag{11}$$

represents the conditional entropy of $y$, and serves as the
average free energy with respect to $x$.
$Dz = (2\pi)^{-1/2}dz\,e^{-z^2/2}$ denotes the Gaussian measure.
The statistical properties of $\Delta$ evaluated in the last
paragraph are employed to assess the second term on the
right-hand side of the last line of Eq. (10).

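$\mathcal{F}$ can be evaluated by means of the replica method.
Namely, we evaluate the $n$-th moment of the partition function
$P(y;H)$ for $n \in \mathbb{N}$ as

$$\operatorname{Tr}_{y}P^{n+1}(y;H) = \operatorname{Tr}_{y,\{x^a\},\{u^a\}}\exp\left(\mathrm{i}\sum_{a=1}^{n+1}(u^a)^{\mathrm{T}}Hx^a\right)\prod_{a=1}^{n+1}\prod_{\mu=1}^{N}\hat{P}(y_\mu|u_\mu^a)\prod_{a=1}^{n+1}\prod_{k=1}^{K}P(x_k^a), \tag{12}$$

and assess $\mathcal{F}$ as

$$\mathcal{F} = -\lim_{n\to 0}\frac{\partial}{\partial n}\frac{1}{N}\ln\operatorname{Tr}_{y}P^{n+1}(y;H), \tag{13}$$

analytically continuing expressions obtained for Eq. (12) from
$n \in \mathbb{N}$ to $n \in \mathbb{R}$. Here, $\{x^a\}$ denotes
a set of $n+1$ replicated vectors $x^0, x^1, \ldots, x^n$, with
$\{u^a\}$ defined similarly.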
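The reason the limit $n \to 0$ recovers the entropy of Eq. (11) can be seen from an elementary identity. The lines below are a sketch that assumes the normalization $\operatorname{Tr}_{y}P(y;H) = 1$ and that differentiation with respect to $n$ may be exchanged with the trace:

$$\lim_{n\to 0}\frac{\partial}{\partial n}\ln\operatorname{Tr}_{y}P^{n+1}(y;H) = \frac{\operatorname{Tr}_{y}P(y;H)\ln P(y;H)}{\operatorname{Tr}_{y}P(y;H)} = \operatorname{Tr}_{y}P(y;H)\ln P(y;H) = -N\mathcal{F}.$$

The moments computed for integer $n$ thus give access to $\mathcal{F}$ once they are continued to real $n$.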



Eq. (13) is generally expressed using $F(\xi,\eta)$, and the
derivation of the expression can be found in [11]. In particular,
the expression obtained under the replica symmetric ansatz,
which is believed to be correct for the current case since the
inference is performed on the basis of the correct posterior (4)
[18], is given in a compact form as

$$\mathcal{F} = -\mathop{\rm Extr}_{q_x,q_u}\left\{A_{xu}(q_x,q_u) + \beta A_x(q_x) + A_u(q_u)\right\}, \tag{14}$$



where

$$A_{xu}(q_x,q_u) = F(T_x - q_x, q_u) + \frac{\hat{T}_uq_u}{2}, \tag{15}$$



$$A_x(q_x) = \mathop{\rm Extr}_{\hat{q}_x}\left\{-\frac{\hat{q}_xq_x}{2} + \int Dz\,P(z;\hat{q}_x)\ln P(z;\hat{q}_x)\right\}, \tag{16}$$



and

$$A_u(q_u) = \mathop{\rm Extr}_{\hat{q}_u}\left\{-\frac{\hat{q}_uq_u}{2} + \operatorname{Tr}_{y}\int Dz\,P(y|z;\hat{q}_u)\ln P(y|z;\hat{q}_u)\right\}, \tag{17}$$

in which
$P(z;\hat{q}_x) = \operatorname{Tr}_{x}P(x)e^{-\hat{q}_xx^2/2 + \sqrt{\hat{q}_x}zx}$
and
$P(y|z;\hat{q}_u) = \int Ds\,P\left(y\big|\sqrt{\hat{T}_u - \hat{q}_u}\,s + \sqrt{\hat{q}_u}\,z\right)$.
The $q_x$ and $q_u$ determined by Eq. (14) represent
$K^{-1}[|\langle x\rangle|^2]$ and $-N^{-1}[|\langle u\rangle|^2]$,
respectively, where $\langle\cdots\rangle$ denotes averaging
over the posterior distribution while $[\cdots]$ indicates the
average with respect to $y$, $U$ and $V$. These averages,
$\langle\cdots\rangle$ and $[\cdots]$, correspond to the thermal
and quenched averages in statistical mechanics, respectively. The
quantities $\hat{q}_x$ and $\hat{q}_u$ appearing in Eqs. (16) and
(17) can be used for assessing performance measures other than
Eq. (10), such as the mean square error (MSE) and the bit error
rate (BER).
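As a concrete illustration of how such performance measures can be read off from the scalar quantities, the sketch below treats binary inputs with $P(x) = 1/2$ on $x = \pm 1$, for which $P(z;\hat{q}_x) = e^{-\hat{q}_x/2}\cosh(\sqrt{\hat{q}_x}z)$. The function name is hypothetical, and the expressions used for the overlap, $q_x = \int Dz\,P(z;\hat{q}_x)\tanh^2(\sqrt{\hat{q}_x}z)$, the MSE, $1 - q_x$, and the BER, $\frac{1}{2}\mathrm{erfc}(\sqrt{\hat{q}_x/2})$, are the standard results for an equivalent scalar Gaussian channel of signal-to-noise ratio $\hat{q}_x$; they are assumptions of this sketch rather than formulas quoted from the text. Gaussian integrals are evaluated by Gauss-Hermite quadrature.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Nodes/weights for the weight exp(-z^2/2); dividing by sqrt(2*pi)
# turns the rule into an approximation of the Gaussian measure Dz.
_Z, _W = hermegauss(80)
_W = _W / math.sqrt(2.0 * math.pi)

def scalar_binary_channel(q_hat):
    """Return (normalization, q_x, MSE, BER) for binary inputs at a given q_hat."""
    s = math.sqrt(q_hat)
    p = np.exp(-0.5 * q_hat) * np.cosh(s * _Z)       # P(z; q_hat) on the nodes
    norm = float(np.sum(_W * p))                     # should equal 1
    q = float(np.sum(_W * p * np.tanh(s * _Z) ** 2)) # overlap of posterior means
    mse = 1.0 - q                                    # T_x = 1 for binary inputs
    ber = 0.5 * math.erfc(math.sqrt(q_hat / 2.0))    # error rate of sign decisions
    return norm, q, mse, ber

norm, q, mse, ber = scalar_binary_channel(1.5)
```

The normalization returned as the first element is a useful sanity check on the quadrature, since $\int Dz\,P(z;\hat{q}_x) = 1$ holds for any normalized prior.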

C. Computationally feasible approximate decoding 

Let us suppose a situation which requires evaluation of the
posterior average

$$m_x = \operatorname{Tr}_{x}xP(x|y;H), \tag{18}$$

where $m_x = (m_{xk})$, with similar notation used for other
vectors below. Eq. (18) serves as the estimator that minimizes
the MSE in general, and can be used to minimize the BER
for binary messages. Exact assessment of such averages is,
however, computationally difficult for large systems, which
motivates us to develop computationally feasible approximation
algorithms [19], [?], [20]. A generalized Gibbs free energy

$$\Phi(m_x,m_u;l) = \mathop{\rm Extr}_{h_x,h_u}\left\{h_x\cdot m_x + h_u\cdot m_u - \ln\left(Z(h_x,h_u;l)\right)\right\}, \tag{19}$$

where
$Z(h_x,h_u;l) = \operatorname{Tr}_{x,u}\prod_{\mu=1}^{N}\hat{P}(y_\mu|u_\mu)\times\prod_{k=1}^{K}P(x_k)\times\exp\left(h_x\cdot x + h_u\cdot(\mathrm{i}u) + (\mathrm{i}u)^{\mathrm{T}}(lH)x\right)$,
offers a useful basis for this purpose since Eq. (18) is
characterized as the unique saddle point of Eq. (19) for $l = 1$
[21], [22].



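MPforPerceptron{
    Perform Initialization;
    Iterate H-Step and V-Step alternately sufficient times;
}

Initialization{
    $\chi_x \leftarrow \ldots$;
    $m_{xk} \leftarrow \operatorname{Tr}_{x_k}x_kP(x_k)$ $(k = 1, 2, \ldots, K)$;
    $h_u \leftarrow Hm_x$; $m_u \leftarrow 0$;
    $\ldots \leftarrow 0$; $\Lambda_x \leftarrow 1/\chi_x$;
}

H-Step{
    Search $(\chi_u, \Lambda_u)$ for given $(\chi_x, \Lambda_x)$ to satisfy conditions
        $\chi_x = \left\langle\frac{\Lambda_u}{\Lambda_x\Lambda_u + \lambda}\right\rangle_\rho$ and $\chi_u = \frac{1-\beta}{\Lambda_u} + \left\langle\frac{\beta\Lambda_x}{\Lambda_x\Lambda_u + \lambda}\right\rangle_\rho$;
    $\hat{\chi}_u \leftarrow 1/\chi_u - \Lambda_u$;
    $h_u \leftarrow h_u - \hat{\chi}_um_u$;
    $m_{u\mu} \leftarrow \frac{\partial}{\partial h_{u\mu}}\ln\int Dx\,P(y_\mu|\sqrt{\hat{\chi}_u}x + h_{u\mu})$ $(\mu = 1, 2, \ldots, N)$;
    $h_x \leftarrow H^{\mathrm{T}}m_u$;
    $\chi_u \leftarrow -\frac{1}{N}\sum_{\mu=1}^{N}\frac{\partial^2}{\partial h_{u\mu}^2}\ln\int Dx\,P(y_\mu|\sqrt{\hat{\chi}_u}x + h_{u\mu})$;
    $\Lambda_u \leftarrow 1/\chi_u - \hat{\chi}_u$;
}

V-Step{
    Search $(\chi_x, \Lambda_x)$ for given $(\chi_u, \Lambda_u)$ to satisfy conditions
        $\chi_x = \left\langle\frac{\Lambda_u}{\Lambda_x\Lambda_u + \lambda}\right\rangle_\rho$ and $\chi_u = \frac{1-\beta}{\Lambda_u} + \left\langle\frac{\beta\Lambda_x}{\Lambda_x\Lambda_u + \lambda}\right\rangle_\rho$;
    $\hat{\chi}_x \leftarrow 1/\chi_x - \Lambda_x$;
    $h_x \leftarrow h_x + \hat{\chi}_xm_x$;
    $m_{xk} \leftarrow \frac{\partial}{\partial h_{xk}}\ln\left(\operatorname{Tr}_{x}P(x)e^{-\frac{1}{2}\hat{\chi}_xx^2 + h_{xk}x}\right)$ $(k = 1, 2, \ldots, K)$;
    $h_u \leftarrow Hm_x$;
    $\chi_x \leftarrow \frac{1}{K}\sum_{k=1}^{K}\frac{\partial^2}{\partial h_{xk}^2}\ln\left(\operatorname{Tr}_{x}P(x)e^{-\frac{1}{2}\hat{\chi}_xx^2 + h_{xk}x}\right)$;
    $\Lambda_x \leftarrow 1/\chi_x - \hat{\chi}_x$;
}

Fig. 1. Pseudocode of the message-passing algorithm MPforPerceptron
[23]. The symbols ";" and "$\leftarrow$" represent the end of a command line and
the operation of substitution, respectively. The quantities $\Lambda_x$ and $\Lambda_u$ are the
counterparts of $\Lambda_\xi$ and $\Lambda_\eta$ in Eq. (8) for $\xi = \chi_x$ and $\eta = \chi_u$, respectively.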



Unfortunately, the evaluation of Eq. (19) is also
computationally difficult. One approach to overcoming this
difficulty is to perform a Taylor expansion around $l = 0$, for
which $\Phi(m_x, m_u; l)$ can be analytically calculated as an
exceptional case, and substitute $l = 1$ in the expression
obtained [21]. However, the evaluation of higher-order terms,
which are not negligible in general, requires a complicated
calculation in this expansion, which sometimes prevents the
scheme from being practically feasible. In order to avoid such
difficulty, we take an alternative approach here, which is
inspired by a derivative of Eq. (19),

$$\frac{\partial\Phi(m_x,m_u;l)}{\partial l} = -\left\langle(\mathrm{i}u)^{\mathrm{T}}Hx\right\rangle_l, \tag{20}$$

following a strategy proposed by Opper and Winther [22]. Here,
$\langle\cdots\rangle_l$ represents the average with respect to
the generalized weight
$\prod_{\mu=1}^{N}\hat{P}(y_\mu|u_\mu)\times\prod_{k=1}^{K}P(x_k)\times\exp\left(h_x\cdot x + h_u\cdot(\mathrm{i}u) + (\mathrm{i}u)^{\mathrm{T}}(lH)x\right)$,
in which $h_x$ and $h_u$ are determined so as to satisfy
$\langle x\rangle_l = m_x$ and $\langle(\mathrm{i}u)\rangle_l = m_u$,
respectively. The right-hand side of this equation is an average
of a quadratic form composed of many random variables.
The central limit theorem implies that such an average does
not depend on the details of the objective distribution, but is
determined only by the values of the first and second moments.
In order to construct a simple approximation scheme, let us
assume that the second moments are characterized
macroscopically by
$\langle|x|^2\rangle_l - |\langle x\rangle_l|^2 = K\chi_x$ and
$\langle|u|^2\rangle_l - |\langle u\rangle_l|^2 = N\chi_u$.
Evaluating the right-hand side of Eq. (20) using a
Gaussian distribution, the first and second moments of which
are constrained to be identical to those of the generalized
weight, and integrating from $l = 0$ to $l = 1$, we have

$$\Phi(\chi_x,\chi_u,m_x,m_u;1) - \Phi(\chi_x,\chi_u,m_x,m_u;0) \simeq -m_u^{\mathrm{T}}Hm_x - NF(\chi_x,\chi_u), \tag{21}$$

where the function $F(\xi,\eta)$ is provided as in Eq. (8)
by the empirical eigenvalue spectrum of $H^{\mathrm{T}}H$,
$\rho(\lambda) = K^{-1}\sum_{k=1}^{K}\delta(\lambda - \lambda_k)$,
and the macroscopic second moments $\chi_x$ and $\chi_u$ are
included in the arguments of the Gibbs free energy because the
right-hand side of Eq. (20) depends on these moments. Eq. (21)
offers a computationally feasible approximation of Eq. (19) for
$l = 1$, since assessment of $\Phi(\chi_x,\chi_u,m_x,m_u;0)$, in
which one can perform summations with respect to relevant
variables independently, can be achieved at a reasonable
computational cost.

Although evaluation of Eq. (21) is computationally feasible,
searching for saddle points of this function within a practical
time is still a non-trivial problem. In Fig. 1 we present a
message-passing type algorithm, which was recently proposed
for a classification problem of single layer perceptrons [23],
as a promising heuristic solution for this problem.

The efficacy of this method under appropriate conditions
was experimentally confirmed for the perceptron problem, and,
for the several ensembles of linear vector channels to which it
has been applied so far, the algorithm has also exhibited
reasonable performance on the current inference task. However,
its properties, including convergence conditions, have not yet
been fully clarified, and therefore further investigation is
necessary for the theoretical validation and improvement of the
performance of this method.

IV. Example: Welch bound equality sequences 

In order to demonstrate the utility of the proposed approach,
let us apply the current methodologies to the analysis of
the matrix ensemble that is characterized by
$\rho(\lambda) = (1 - \beta^{-1})\delta(\lambda) + \beta^{-1}\delta(\lambda - \beta)$
under the assumption $\beta > 1$, which corresponds to the case
of Welch bound equality sequences (WBES) [24]. We focus on the
case of the Gaussian channel
$P(y|\Delta) = (2\pi\sigma^2)^{-1/2}\exp\left(-(y - \Delta)^2/(2\sigma^2)\right)$
and binary inputs $x \in \{+1, -1\}^K$, since this constitutes a
simple, yet non-trivial problem. Under these assumptions, the
developed framework has a higher capability than is required for
the assessment of the typical communication performance with
respect to the matrix ensemble, which can be carried out by
a simpler method developed by the author and his colleagues
[25], [10], as was recently shown by Kitagawa and Tanaka
[26]. Nevertheless, the framework is still useful as one can
derive a computationally feasible approximate decoding
algorithm with good convergence properties based on the procedure
shown in Fig. 1.

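For Gaussian channels, $\Lambda_u$ in Fig. 1 can be fixed as
$\Lambda_u = \sigma^2$ in general. This yields an algorithm

$$m_u^{t+1} = \frac{1}{\sigma^2 + \hat{\chi}_u^t}\left(y - Hm_x^t + \hat{\chi}_u^tm_u^t\right), \tag{22}$$

$$m_{xk}^{t+1} = \tanh\left(\sum_{\mu=1}^{N}H_{\mu k}m_{u\mu}^{t+1} + \hat{\chi}_x^{t+1}m_{xk}^t\right) \quad (k = 1, 2, \ldots, K), \tag{23}$$

for WBES, where $t$ denotes the number of iterations.
$\hat{\chi}_u^t$ in Eq. (22) is provided as
$\hat{\chi}_u^t = \beta/\Lambda_x$, where $\Lambda_x$ is
determined so as to satisfy
$\chi_x^t = (1 - \beta^{-1})/\Lambda_x + \beta^{-1}\sigma^2/(\Lambda_x\sigma^2 + \beta)$
for given $\chi_x^t = 1 - K^{-1}|m_x^t|^2$. Utilizing the
identical $\Lambda_x$, $\hat{\chi}_x^{t+1}$ in Eq. (23) is
evaluated as $\hat{\chi}_x^{t+1} = 1/\chi_x^t - \Lambda_x$.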

Fig. 2 compares the BER for the theoretical assessment by
the replica method with the experimental evaluation obtained
by the algorithm of Eqs. (22) and (23). In the experiments,
the estimates of the binary messages are computed as
$\hat{x}_k = {\rm sign}(m_{xk})$ for $k = 1, 2, \ldots, K$, where
${\rm sign}(a) = a/|a|$ for $a \neq 0$. This decoding scheme is
optimal for minimizing BER if $m_x$ represents the correct
posterior average (18) [27]. The excellent agreement between the
curves and markers in this plot validates both the performance
assessment based on the replica method and that based on the
developed algorithm. A characteristic feature of Eqs. (22) and
(23) is the inclusion of macroscopic variables $\hat{\chi}_u^t$
and $\hat{\chi}_x^{t+1}$, which are expected to act to cancel the
self-reactions from previous states [28].
Fig. 3 plots the influence of this operation, indicating that
the cancellation acts to maintain the quality of the converged
solution up to larger $\beta$ under a condition of fixed SNR.
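The iteration of Eqs. (22) and (23) can be sketched in a few lines of NumPy. The code below is a minimal illustration and not the author's implementation. The helper names (`haar_orthogonal`, `wbes_like_matrix`, `solve_lambda_x`, `decode_wbes`), the initial condition $m_x = m_u = 0$, the small floor on $\chi_x$, and the use of a Haar-rotated matrix with all nonzero eigenvalues of $H^{\mathrm{T}}H$ equal to $\beta$ as a stand-in for WBES are assumptions made here. The update of $m_u$ assumes the prefactor $1/(\sigma^2 + \hat{\chi}_u^t)$, and $\Lambda_x$ is obtained as the positive root of the quadratic $\chi_x\sigma^2\Lambda_x^2 + (\chi_x\beta - \sigma^2)\Lambda_x - (\beta - 1) = 0$, which is the condition on $\chi_x^t$ quoted above after clearing denominators.

```python
import numpy as np

def haar_orthogonal(n, rng):
    """Haar-distributed orthogonal matrix via QR with a sign correction."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

def wbes_like_matrix(N, K, rng):
    """N x K matrix with H H^T = beta * I (beta = K/N > 1), Haar singular bases."""
    beta = K / N
    U = haar_orthogonal(N, rng)
    V = haar_orthogonal(K, rng)
    return np.sqrt(beta) * U @ V[:, :N].T

def solve_lambda_x(chi_x, beta, sigma2):
    """Positive root of chi*s2*L^2 + (chi*beta - s2)*L - (beta - 1) = 0."""
    a = chi_x * sigma2
    b = chi_x * beta - sigma2
    c = beta - 1.0
    disc = np.sqrt(b * b + 4.0 * a * c)
    # Pick the algebraically equivalent form that avoids cancellation.
    return 2.0 * c / (b + disc) if b > 0 else (-b + disc) / (2.0 * a)

def decode_wbes(H, y, sigma2, n_iter=200):
    """Iterate Eqs. (22)-(23) for binary inputs; returns the estimate of m_x."""
    N, K = H.shape
    beta = K / N
    m_x = np.zeros(K)            # prior mean of uniform binary inputs
    m_u = np.zeros(N)
    for _ in range(n_iter):
        chi_x = max(1.0 - m_x @ m_x / K, 1e-8)   # floor avoids division by zero
        lam_x = solve_lambda_x(chi_x, beta, sigma2)
        chi_u_hat = beta / lam_x
        m_u = (y - H @ m_x + chi_u_hat * m_u) / (sigma2 + chi_u_hat)  # Eq. (22)
        chi_x_hat = 1.0 / chi_x - lam_x
        m_x = np.tanh(H.T @ m_u + chi_x_hat * m_x)                    # Eq. (23)
    return m_x

rng = np.random.default_rng(1)
N, K, sigma2 = 200, 220, 0.1                     # beta = 1.1
H = wbes_like_matrix(N, K, rng)
x = rng.choice([-1.0, 1.0], size=K)
y = H @ x + np.sqrt(sigma2) * rng.standard_normal(N)
m_x = decode_wbes(H, y, sigma2)
ber = np.mean(np.sign(m_x) != x)                 # empirical bit error rate
```

The system size used here is much smaller than in the experiments of Fig. 2, so the empirical BER from a single run fluctuates considerably; averaging over many sample systems would be needed for a quantitative comparison with the replica prediction.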

Fig. 2. BER vs. signal-to-noise ratio (SNR) for binary inputs for the case
$\beta = 1.1$. The SNR plotted on the horizontal axis is given by $-10\log_{10}(2\sigma^2)$
while the vertical axis denotes the BER. The curves indicate theoretical
predictions, which correspond to the scalar Gaussian channel, WBES and
the basic matrix ensemble (BASIC) from the bottom. Sample matrices of
BASIC are composed of IID entries of zero mean and $1/N$ variance Gaussian
random variables. Values for WBES and BASIC are assessed by the replica
method. The markers indicate experimental estimates of the BER obtained
from 500 sample systems with $K = 2048$ and $N = 1862$ on the basis of
the algorithm shown in Fig. 1. Excellent agreement between the curves and
markers validates both the performance analysis based on the replica method
and that of the developed approximation algorithm.

V. Summary

In summary, we have developed a framework to analyze
linear vector channels in a large system limit. The framework
is based on the assumptions that the left and right
singular value bases of the channel matrix can be regarded
as independently drawn from Haar measures over orthogonal
(unitary, if the number field is defined over the complex
variables) groups, and that the eigenvalues of the cross
correlation matrix of the channel matrix asymptotically approach
a certain specific distribution in the limit of large matrix
size. These modeling assumptions allow a characterization of
the system in terms of an integral formula in two variables,
which is fully determined by the eigenvalue distribution. Upon
applying this formula in conjunction with the replica method,
we have derived a general expression for the typical mutual
information of general memory-less channels with factorizable
priors of continuous/discrete inputs. We have further proposed
a computationally feasible decoding algorithm based on the
formula, and have found that numerical results obtained from
this algorithm are in excellent agreement with the theoretical
predictions evaluated by the replica method.

Future research directions include the application of the
developed framework to various models of linear vector channels,
and further improvement of the computationally feasible
decoding algorithm.